Background: Digital retinal photography with mydriasis is the preferred modality for diabetes 
eye screening. The purpose of this study was to evaluate agreement in grading levels between 
primary and secondary graders and to calculate their sensitivity and specificity for identifying 
sight-threatening disease in an optometry-based retinopathy screening program. 
Methods: This was a retrospective study using data from 8,977 patients registered in the North 
Nottinghamshire retinal screening program. In all cases, the ophthalmology diagnosis was used 
as the arbitrator and considered to be the gold standard. Kappa statistics were used to evaluate 
the level of agreement between graders. 

Results: Agreement between primary and secondary graders was 51.4% and 79.7% for detecting 
no retinopathy (R0) and background retinopathy (R1), respectively. For preproliferative 
(R2) and proliferative retinopathy (R3) at primary grading, agreement between the primary and 
secondary grader was 100%. Where there was disagreement between the primary and secondary 
grader for R1, only 2.6% (n=41) were upgraded by an ophthalmologist. The sensitivity and 
specificity for detecting R3 were 78.2% and 98.1%, respectively. None of the patients upgraded 
from any level of retinopathy to R3 required photocoagulation therapy. The observed kappa 
between the primary and secondary grader was 0.3223 (95% confidence interval 0.2937-0.3509), 
ie, fair agreement, and between the primary grader and ophthalmology for R3 was 0.5667 (95% 
confidence interval 0.4557-0.6123), ie, moderate agreement. 

Conclusion: These data provide information on the safety of a community optometry-based 
retinal screening program for screening as a primary and as a secondary grader. The level of 
agreement between the primary and secondary grader at a higher level of retinopathy (R2 and 
R3) was 100%. Sensitivity and specificity for R3 were 78.2% and 98.1%, respectively. None 
of the false-negative results required photocoagulation therapy. 
Keywords: retinopathy, screening, public health, community, optometry, diabetes 

Introduction 

Diabetic retinopathy is a highly specific microvascular complication of diabetes and 
the leading cause of blindness in people under the age of 60 years in industrialized 
countries. Data from the Early Treatment of Diabetic Retinopathy Study showed that 
early laser treatment would be more than 90% effective in preventing blindness, and 
as such, early detection of sight-threatening disease is crucial in preventing blindness 
in this group of patients. To this end, previous studies have shown the effectiveness of 
diabetes eye screening programs to prevent blindness in patients with diabetes. The 
United Kingdom National Screening Committee therefore recommended a systematic 
population screening program, which was implemented in 2003. As a result, the current 
National Health Service (NHS) Diabetic Eye Screening Programme is in place. 
Digital retinal photography with mydriasis is the preferred 
modality for diabetic eye screening based on its reported 
values for sensitivity and specificity, and its ability to quality 
assure screening standards. This modality of retinopathy 
screening fulfils the Exeter minimum standard for 
sensitivity and specificity of 80% and 95%, respectively, for robust 
and safe diabetic retinopathy screening. Conventionally, 
this utilizes technicians to perform the primary grading, with 
secondary grading performed by more experienced screeners 
or clinicians, and arbitration grading performed by an 
ophthalmologist or a diabetologist with expertise in diabetic 
retinopathy screening. However, in selected screening programs, 
primary and secondary gradings are performed by trained 
opticians. Whilst data are available on the effectiveness of 
individual screening modalities, there is currently 
only one study that has looked at the interobserver agreement 
between primary graders and an expert grader. Information 
on the safety, effectiveness, and agreement between primary 
and secondary graders for images of patients undergoing 
routine diabetic eye screening in a community optometry-based 
retinopathy screening program has not yet been reported. 

Materials and methods 

The North Nottinghamshire diabetic retinopathy screening 
service has utilized an optometry-based model since April 
2006 and involves 36 optometrists across 21 sites. Screen- 
ing is undertaken by local optometrists, and two-field digital 
images of the retina are recorded in the database and graded. 
All models and makes of the retinal cameras in use, as well 
as their age, are approved based on criteria set by the NHS 
Diabetic Eye Screening Programme. Tropicamide 1% is 
used to dilate the pupils to an acceptable size for screening, 
which is performed according to a standard national screen- 
ing protocol. Primary and secondary grading is carried out 
by optometrists on the digital retinal images, and a web- 
based referral to an ophthalmologist is required if there is 
disagreement between primary and secondary graders or if 
sight-threatening retinopathy is observed. 

For this study, data were collected retrospectively 
between January 2011 and December 2011 from a cohort 
of 8,977 patients registered in an optometry-based retinal 
screening program database currently in place in North 
Nottinghamshire. These patients were reviewed by optom- 
etrists who carried out digital retinal photography. Images 
were stored in a web-based database and graded according 
to the national screening standard. Grading levels were as 
follows: no retinopathy (R0), background retinopathy (R1), 
preproliferative retinopathy (R2), proliferative retinopathy 
(R3), and maculopathy (M1). Any retinopathy detected by 
a primary grader (R1, R2, M1) and 10% of images with no 
evidence of retinopathy (R0) were sent for secondary grading 
performed by another optometrist. If there was any disagree- 
ment between the primary and secondary grader, the images 
were sent to arbitration, which was performed by an oph- 
thalmologist. The presence of proliferative retinopathy (R3) 
would require an urgent referral to ophthalmology. However, 
during 2011, due to an internal quality audit that was being 
undertaken, all patients with R1 were referred to the 
ophthalmologist for screening. Retinal images that were not gradable 
by the primary grader for reasons such as previous surgery or 
cataracts were referred directly to ophthalmology. Patients 
under ophthalmology follow-up were kept under ophthal- 
mology review with follow-up appointments until their 
retinopathy was stable. The screening program also has in 
place a fail-safe mechanism (monitored by a fail-safe officer) 
whereby, on an ongoing basis, images of patients subsequently 
found to have R3 or to have undergone photocoagulation 
therapy are traced back to see whether the disease was missed 
during screening. No R3 was missed at screening during the 
period of this audit. Once the patients had stable retinopathy with 
no immediate intervention required, they were referred back 
into the local retinal screening recall process. 

We calculated the agreement between the primary and 
secondary grader as well as between individual graders and 
ophthalmologists by means of kappa statistics. We also 
looked at the proportion of disagreements leading to an 
upgrading of the retinopathy level. Assessment of sensitivity and 
specificity values in this study was limited to images graded 
as R3, since all R3 are referred to an ophthalmologist for 
arbitration or a final grading. R3 grading from the primary grader 
was compared against the "gold standard" ophthalmological 
diagnosis. Sensitivity was calculated as the number of true 
positives divided by the sum of true positives and false 
negatives, and specificity as the number of true negatives 
divided by the sum of true negatives and false positives. 
This work is classed as a service evaluation. The audit 
work and data derived from this work are part of the program's 
ongoing clinical governance exercise to maintain standards of 
retinopathy screening within the service. The statistical analysis 
was performed using SPSS version 14 software (SPSS Inc., 
Chicago, IL, USA). 

Results 

Of 8,977 patients (15,583 images), 734 patients were graded 
as R0 by the primary grader. Of these, 377 were graded as 
R0 by the secondary grader. This resulted in 51.4% 
agreement between the primary and secondary grader for patients 
graded as R0 at primary grading. The other 357 patients had 
no agreement between the primary and secondary grader. 
From these, 4.8% (n=17) were downgraded and 3.6% (n=13) 
were upgraded by ophthalmology (Table 1). 

Background retinopathy grading (R1) was given to 
7,784 patients by the primary grader and 1,448 of these were 
graded by ophthalmology. The level of agreement between 
primary and secondary graders in this group was 79.7% 
(n=6,204). Among these patients, agreement between the 
primary grader and ophthalmology was 15.5% (n=1,207), 
while the agreement between the secondary grader 
and ophthalmology was 10.7% (n=835). For the proportion 
in which there was disagreement between the primary and 
secondary grader, 2.6% (n=41) were upgraded, of which 1% 
(n=16) were upgraded to R3 (Table 1). For the proportion 
in which there was disagreement between the primary and 
secondary grader, 0.8% (n=13) were downgraded to a 
different grade by ophthalmology (Table 1). Where patients were 
graded R2 (n=210) at primary grading, agreement between 
the primary and secondary grader was 100% (Table 1); 207 of 
the 210 that were graded as R2 by the primary grader were 
graded by the secondary grader as well as ophthalmology. 
This was due to an internal quality assurance audit that was 
taking place in 2011. 

Proliferative retinopathy (R3) was detected in 249 patients 
by the primary grader, but only 31.7% (n=79) of these were 
subsequently confirmed as R3 by ophthalmology. Of the 
total population screened (n=8,977), 8,728 were found not 
to have R3 by the primary grader, while 1,777 patients were 
confirmed by ophthalmology not to have R3. From these 
data, the sensitivity and specificity for R3 in our cohort 
were 78.2% and 98.1%, respectively (Table 1). Disagreement 
in grading led to an upgrading of retinopathy level by 
ophthalmology in 3.6% of normal (R0) and 2.6% of 
background retinopathy (R1) cases. Ten percent of images 
graded as R0 went through to ophthalmology for arbitration. 
Of these, there was no agreement between the primary and 
secondary grader, but there was 56.6% agreement between 
the primary grader and ophthalmology, and 36.6% agreement 
between the secondary grader and ophthalmology. 
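
The arithmetic behind the R3 sensitivity and specificity can be sketched in Python as follows. This is an illustration only, not the SPSS analysis used in the study. The 249 primary-grader R3 calls, the 79 confirmed by ophthalmology, and the 8,728 patients not graded R3 are taken from the text; the figure of 22 false negatives is not stated anywhere in the text and is an assumption, back-calculated so that the result matches the reported 78.2%. The function and variable names are hypothetical.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """True positives divided by all cases that truly have the condition."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """True negatives divided by all cases that truly lack the condition."""
    return true_neg / (true_neg + false_pos)


# Counts stated in the text.
primary_r3 = 249        # graded R3 by the primary grader
confirmed_r3 = 79       # of those, confirmed as R3 by ophthalmology
primary_not_r3 = 8728   # not graded R3 by the primary grader

true_pos = confirmed_r3
false_pos = primary_r3 - confirmed_r3    # 170 R3 calls not confirmed

# ASSUMPTION: the number of missed R3 cases is not reported; 22 is
# back-calculated from the reported sensitivity of 78.2%.
false_neg = 22
true_neg = primary_not_r3 - false_neg    # 8,706 under that assumption

print(f"sensitivity = {sensitivity(true_pos, false_neg):.1%}")
print(f"specificity = {specificity(true_neg, false_pos):.1%}")
```

Under the stated assumption, the same split of the 8,728 non-R3 patients also yields the reported specificity, which suggests the two published figures are at least mutually consistent.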

We used Kappa statistics to evaluate the level of agree- 
ment between primary and secondary graders and between 
primary and arbitration graders for R0-R2. There was 
an observed kappa of 0.3223 (95% confidence interval 
0.2937-0.3509) and 0.269 (95% confidence interval 
0.216-0.321), respectively (Tables 2 and 3). The level of 
agreement between the primary grader and ophthalmology 
for R3 using Kappa statistics gives an observed kappa of 
0.5667 (95% confidence interval 0.4557-0.6123). 
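
The kappa values for R0-R2 can be checked from a grader-by-grader cross-tabulation. The following is a minimal Python sketch of unweighted Cohen's kappa; it assumes the unweighted form is the one reported (the text does not say which variant SPSS was asked to compute), and the function and variable names are illustrative. The two 3x3 matrices are copied from the counts as printed under Tables 2 and 3.

```python
from typing import List


def cohen_kappa(matrix: List[List[int]]) -> float:
    """Unweighted Cohen's kappa for a square agreement matrix of counts."""
    n = sum(sum(row) for row in matrix)
    # Observed agreement: proportion of cases on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    # Chance agreement: product of the marginal totals for each grade.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# Counts as printed under Table 2 (grades R0, R1, R2).
table2_counts = [[17, 185, 6], [12, 1207, 122], [0, 36, 78]]
# Counts as printed under Table 3 (grades R0, R1, R2).
table3_counts = [[377, 1107, 0], [354, 6204, 0], [3, 261, 210]]

print(f"Table 2 counts: kappa = {cohen_kappa(table2_counts):.4f}")
print(f"Table 3 counts: kappa = {cohen_kappa(table3_counts):.4f}")
```

Applied to the counts as printed, the matrix under Table 3 gives approximately 0.3223 and the matrix under Table 2 approximately 0.269, which are the two values reported in the text. Kappa is unchanged when rows and columns are swapped, so the orientation of the axes does not affect the result.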

Discussion 

For a systematic screening program to be effective, it needs 
a database that is robust and well maintained. The system 
currently in place in North Nottinghamshire uses a central 
call/recall center with ongoing quality assurance taking place 
at all stages of the process. In addition to their professional 
qualification registered by the General Optical Council, 
which regulates dispensing opticians and optometrists, all 
screeners/graders would have undertaken a certificate for 
diabetic retinopathy screening by City and Guilds, as well 
as undergoing a test training set mandated by the NHS 
Diabetic Eye Screening Programme. During the period of the 
audit, one test training set was performed by the opticians. 
However, data for the intergrader agreement based on this 
exercise were not available. Although the national program 
recommended only 10% of R0 to be secondarily screened, we 
performed an internal audit for the year 2009-2010, where 
all R0 underwent secondary grading as a result of a quality 
assurance exercise recommended by the NHS Retinopathy 
Screening Programme. No sight-threatening retinopathy 
(R2 or higher) was identified. 

Table 1 Percentage of agreement, disagreement, upgrading, and downgrading of images in the North Nottingham screening program 

                                   R0 (n=734)      R1 (n=7,784)    R2 (n=210)      R3 (n=249)
                                   n (%)           n (%)           n (%)           n (%)
Agreement between primary          377 (51.4%)     6,204 (79.7%)   210 (100%)      249 (100%)
  and secondary grader
Agreement between primary          Not evaluated   1,207 (15.5%)   78 (37.1%)      79 (31.7%)
  grader and ophthalmology
Agreement between secondary        Not evaluated   835 (10.7%)     78 (37.1%)      Not evaluated
  grader and ophthalmology
Disagreement leading to            17 (4.8%)       Not evaluated   Not evaluated   113 (45.4%)
  downgrading by ophthalmologist
Disagreement leading to upgrading  13 (3.6%)       41 (2.6%)       Not evaluated   Not evaluated
  by ophthalmologist
Disagreement leading to upgrading  Not evaluated   13 (0.8%)       Not evaluated   Not evaluated
  to R3 by ophthalmologist

Notes: Using Kappa statistics to evaluate agreement between primary grader and ophthalmology for R3, the observed K is 0.57 (95% confidence interval 0.46-0.61), 
ie, moderate agreement. Sensitivity and specificity for detecting R3 are 78.2% and 98.1%, respectively. 

Abbreviations: R0, no retinopathy; R1, background retinopathy; R2, preproliferative retinopathy; R3, proliferative retinopathy. 

Table 2 Agreement and disagreement for primary grader (horizontal 
axis) and secondary grader (vertical axis) 

        R0      R1      R2
R0      17      185     6
R1      12      1,207   122
R2      0       36      78

Notes: Using Kappa statistics to evaluate overall level of agreement between 
primary and secondary graders for R0-R2, the observed K is 0.3223 (95% confidence 
interval 0.2937-0.3509). 

Abbreviations: R0, no retinopathy; R1, background retinopathy; R2, preproliferative 
retinopathy. 

The above study provides novel information on the safety 
and effectiveness of a community-based retinal screening 
program that uses optometrists at both the primary and 
secondary grader level compared with other optometry or 
nonoptometry-based programs that use senior graders, dia- 
betologists, or ophthalmologists as secondary graders. 

Evidence for the effectiveness of screening is based on 
evidence of treatment efficacy, especially after early detection, 
and on cost-effectiveness. Comparing this screening program 
with the Exeter standards, ours achieved a specificity 
level above the expected 95% but the sensitivity level was 
marginally short of the recommended 80% threshold. Of 
note, the sensitivity data here refer to data analysis specific 
to R3 rather than data from the whole program. Moreover, it 
is conceivable that the slightly higher level of false-positives 
observed here reflects a slightly overcautious approach by 
optometrists to grading in patients with a higher likelihood of 
abnormalities in their eyes. In addition, image arbitration was 
performed by an ophthalmologist who may decide on the final 
"grade" based on clinical need for photocoagulation therapy 
rather than actual reporting of the images. Nevertheless, 
appropriate sensitivity and specificity for any screening 
modality have become more important in view of 
some recent evidence which may advocate for a different 
frequency of retinopathy screening for different individuals 
depending on the risk of retinopathy progression, based on 
baseline and/or previous screening results. Despite a high 
false-negative rate, none of the false negatives required 
urgent photocoagulation therapy, which reflects a subsequent 
"clinical" diagnosis by the ophthalmologist rather than a 
misdiagnosis by the optometrist. This has been confirmed 
by regular audit of our data based on the governance 
structure currently in place in our screening program. It was also 
reassuring to note that the levels of agreement between 
primary and secondary graders for higher levels of retinopathy 
(R2 and R3) were both 100%. For lower levels of retinopathy, 
ie, R0 and R1, agreement between primary and secondary 
graders was lower at 51.4% and 79.7%, respectively. Of 
these, 3.6% of normal (R0) and 2.6% of background (R1) 
retinopathy showed a disagreement in grading, leading to an 
upgrading of retinopathy level by ophthalmology, but none 
required photocoagulation therapy. 

Table 3 Agreement and disagreement for primary grader (horizontal 
axis) and arbitration grader (vertical axis) 

        R0      R1      R2
R0      377     1,107   0
R1      354     6,204   0
R2      3       261     210

Notes: Using kappa statistics to evaluate overall level of agreement between 
primary and arbitration graders for R0-R2, the observed K is 0.269 (95% confidence 
interval 0.216-0.321). 

Abbreviations: R0, no retinopathy; R1, background retinopathy; R2, preproliferative 
retinopathy. 

Some limitations of this study need to be highlighted. 
To calculate sensitivity and specificity, we analyzed data 
specific to R3 only. This was because only 10% of R0 and 
some of R1 and R2 were referred to ophthalmology, whereas 
all R3 were referred to an independent ophthalmologist. 
Because of this, we were unable to look at the sensitivity 
and specificity for the whole cohort, which affects the results 
reported in our study. We used the ophthalmologist grade 
as the gold standard, so it would be important to have all 
retinopathy graded as R2 by the primary grader reviewed 
by ophthalmology to ensure that none of these would need 
to be upgraded to R3, which would mean they would need 
ophthalmology follow-up and potential treatment. The study 
was carried out by retrospective data collection, which would 
also be considered a limitation, due to the presence of 
confounding biases. We were also not able to reliably 
determine results for maculopathy within our program. Further, 
we were not able to accurately adjust results for ungradable 
images, due to poor patient compliance with the screening 
protocol, poor mydriasis, or other factors. Interpretation of 
the results is limited to this program and cannot necessarily 
be generalized to other programs. Lastly, although kappa 
statistics are a recognized method for assessment of 
agreement, the magnitude of kappa reflecting adequate agreement 
is unclear. Arbitrary guidelines are available to 
indicate level of agreement, although these are not 
evidence-based. Generally, however, it is accepted that a kappa 
score >0.80 would suggest very good agreement. Despite 
this, due to methodological limitations of other research in 
this area, and due to a lack of data and evidence of 
optometrists as primary and secondary graders in detecting R3 in 
a retinopathy screening program, we believe data from this 
study would enhance available knowledge concerning the 
safety and effectiveness of an optometry community-based 
retinopathy screening program. 
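
One commonly used set of descriptive bands for kappa can be sketched as below. The cut-offs are an assumption: they follow a widely quoted five-band convention (poor, fair, moderate, good, very good) that is consistent with the labels used in this paper, but the paper does not state which scheme was applied, and as noted above such bands are arbitrary rather than evidence-based. The function name agreement_label is hypothetical.

```python
def agreement_label(kappa: float) -> str:
    """Map a kappa value to a descriptive band (assumed conventional cut-offs)."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Kappa values reported in this study.
for value in (0.3223, 0.269, 0.5667):
    print(f"kappa = {value}: {agreement_label(value)} agreement")
```

With these cut-offs, 0.3223 and 0.269 both fall in the fair band and 0.5667 in the moderate band, matching the descriptions given for the reported values.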

There is no clear evidence suggesting who has the best 
sensitivity and specificity for detecting sight-threatening 
retinopathy, ie, whether it is independent graders, 
optometrists, diabetologists, general practitioners, or 
ophthalmologists. A single study showed that retinal photographs 
assessed by optometrists could achieve >91% sensitivity in 
detecting R3 or sight-threatening retinopathy. Data on the 
effectiveness of individual screening modalities are widely 
available. However, our study provides unique data 
on the safety, effectiveness, and agreement between primary 
and secondary graders for images of patients undergoing 
routine diabetes eye screening in a community 
optometry-based retinopathy screening program. 